Start slurmd in tpu nodes with -N slurmName #3927

Open: jvilarru wants to merge 1 commit into base: develop

jvilarru
Contributor

It is important that this commit is pushed only after the related slurm-gcp commit, "Start docker container with real hostname", is present in the image.

This commit changes how the slurmd process is started in the Docker container. The container now starts with the real hostname of the TPU, so slurmd needs to be started with -N nodename.
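
For illustration only, here is a minimal sketch of the kind of command line this implies. It is not the code in this commit: the helper name `build_slurmd_command` and the node name `tpu-node-0` are hypothetical, and running in the foreground with `-D` is an assumption about how a container entrypoint might launch the daemon. Only the `-N` option comes from the description above.

```python
import shlex
from typing import List


def build_slurmd_command(slurm_node_name: str, foreground: bool = True) -> List[str]:
    """Build an argv list that starts slurmd under an explicit Slurm node name.

    Hypothetical helper, for illustration only.
    """
    if not slurm_node_name:
        raise ValueError("slurm_node_name must be non-empty")

    argv = ["slurmd"]
    if foreground:
        # Assumption: the daemon stays in the foreground inside the container.
        argv.append("-D")
    # -N makes slurmd run as the given node name instead of deriving
    # the name from the machine's hostname.
    argv += ["-N", slurm_node_name]
    return argv


if __name__ == "__main__":
    # "tpu-node-0" is a placeholder, not a name taken from this PR.
    print(shlex.join(build_slurmd_command("tpu-node-0")))
```

Passing the name explicitly matters here because, once the container hostname is the TPU's real hostname, it may no longer match the node name that Slurm expects.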

Submission Checklist

NOTE: Community submissions can take up to 2 weeks to be reviewed.

Please take the following actions before submitting this pull request.

  • Fork your PR branch from the Toolkit "develop" branch (not main)
  • Test all changes with pre-commit in a local branch
  • Confirm that "make tests" passes all tests
  • Add or modify unit tests to cover code changes
  • Ensure that unit test coverage remains above 80%
  • Update all applicable documentation
  • Follow Cluster Toolkit Contribution guidelines

@jvilarru jvilarru requested review from samskillman and a team as code owners April 11, 2025 13:43
@jvilarru
Contributor Author

The slurm-gcp PR is GoogleCloudPlatform/slurm-gcp#260

@mr0re1
Collaborator

mr0re1 commented Apr 11, 2025

Closed GoogleCloudPlatform/slurm-gcp#260
Waiting for changes to be applied here.

@wiktorn
Contributor

wiktorn commented Apr 14, 2025

> Closed GoogleCloudPlatform/slurm-gcp#260
> Waiting for changes to be applied here.

The startup script needs to be included in the devel.zip, as noted here
